-
Notifications
You must be signed in to change notification settings - Fork 7.9k
simplify unpack logic #6908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
simplify unpack logic #6908
Conversation
- move endiannes check to compile time - remove php_unpack function - the compiler take care of sign extension
ext/standard/pack.c
Outdated
#define IS_LITTLE_ENDIAN (*(unsigned char *)&(uint16_t){1}) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't you reuse the WORDS_BIGENDIAN
macro which is defined in Zend/zend_types.h ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure can.
|
||
if (type == 'q') { | ||
v = (zend_long) v; | ||
v = (int64_t) x; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Far from being a C expert but isn't the following:
v = (int64_t) x; | |
v = *(int64_t*) &x; |
the correct way of type-punning? As IIRC signed/unsigned casts are UB if the value exceeds the range of a type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For casts between integers, the existing code is fine. What you may have in mind is a bitwise cast between float and int.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, I double checked the standard and apparently unsigned to signed is UB when the value doesn't fit, but almost all compiler do two complement (according to https://unspecified.wordpress.com/category/c-3/ and if I read this correctly).
So should only be an issue on unicorn compilers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The standard (6.3.1.3 §3) mentions this conversion produce a result which is implementation-defined, not UB.
Is it OK to keep it like this?
ext/standard/pack.c
Outdated
map = little_endian_short_map; | ||
v = (int16_t) x; | ||
} else if ((type == 'n' && MACHINE_LITTLE_ENDIAN) || (type == 'v' && !MACHINE_LITTLE_ENDIAN)) { | ||
v = php_pack_reverse_int32(x) >> 16; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add reverse_int16?
ext/standard/pack.c
Outdated
case 'v': { /* unsigned little endian */ | ||
zend_long v = 0; | ||
uint16_t x; | ||
memcpy(&x, &input[inputpos], 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nicer to load through unaligned_uint16_t
instead? (Grep for existing typedefs...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean like this: uint16_t x = *((unaligned_uint16_t*) &input[inputpos])
Isn't in C access to char array through pointer of any other type forbidden?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't really care about strict aliasing as long as it's not actively miscompiled ;) I'm okay with keeping the memcpy though, I expect it to compile down to about the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok then, I'll change it.
|
||
if (type == 'q') { | ||
v = (zend_long) v; | ||
v = (int64_t) x; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For casts between integers, the existing code is fine. What you may have in mind is a bitwise cast between float and int.
Sorry, I forgot to merge this :( |
This is the first PR in an effort to speed up unpack function.
So far it mainly simplifies the code. There's also a small speedup by not doing byte-wise unpacking by php_unpack.